Smart Paper Notes

Fast-FNet: Accelerating Transformer Encoder Models via Efficient Fourier Layers

Nurullah Sevim , Ege Ozan Özyedek , Furkan Şahinuç , Aykut Koç

Categories: Natural Language Processing | Artificial Intelligence

2022-09-26

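Transformer-based language models use the attention mechanism to obtain substantial performance improvements in almost all natural language processing (NLP) tasks. Similar attention structures are also studied extensively in several other areas. Although the attention mechanism significantly enhances model performance, its quadratic complexity prevents efficient processing of long sequences. Recent work has focused on eliminating this computational inefficiency and has shown that transformer-based models can still reach competitive results without the attention layer. A pioneering study proposed FNet, which replaces the attention layer in the transformer encoder architecture with the Fourier Transform (FT). FNet achieves performance competitive with the original transformer encoder model while accelerating training by removing the computational burden of the attention mechanism. However, the FNet model ignores essential properties of the FT, known from classical signal processing, that can be leveraged to increase model efficiency further. We propose different methods to deploy the FT efficiently in transformer encoder models. Our proposed architectures have fewer model parameters, shorter training times, lower memory usage, and some additional performance improvements. We demonstrate these improvements through extensive experiments on common benchmarks.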
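
The abstract does not say which FT properties are exploited. One classical property that fits the description is conjugate symmetry: for a real-valued input of length N, the DFT satisfies X[N-k] = conj(X[k]), so the real part (which FNet keeps) is mirrored and only the first N/2 + 1 coefficients carry distinct values. The following is a minimal pure-Python sketch of that property, not the authors' implementation; the `hidden` vector and the helper names are hypothetical.

```python
import cmath


def dft(signal):
    """Naive O(N^2) discrete Fourier transform of a sequence of numbers."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def real_part_half(signal):
    """Real part of the first N//2 + 1 DFT coefficients (the non-redundant half)."""
    n = len(signal)
    return [c.real for c in dft(signal)[: n // 2 + 1]]


# Hypothetical real-valued hidden vector (e.g. one token embedding of size 8).
hidden = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0, 1.0]

full = [c.real for c in dft(hidden)]  # all 8 real parts
half = real_part_half(hidden)         # only 5 of them are distinct

# Rebuild the full real part from the half by mirroring indices 3, 2, 1.
mirrored = half + half[-2:0:-1]
```

Here `mirrored` matches `full` up to floating-point error, which is why a layer that keeps only the real part of the DFT can, in principle, work with roughly half of the coefficients.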

Large-Scale Hate Speech Detection with Cross-Domain Transfer

Cagri Toraman , Furkan Şahinuç , Eyup Halit Yilmaz

Categories: Natural Language Processing

2022-03-02

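The performance of hate speech detection models depends on the datasets on which the models are trained. Existing datasets are mostly prepared with a limited number of instances or a limited number of hate domains that define hate topics. This hinders large-scale analysis and transfer learning with respect to hate domains. In this study, we construct large-scale tweet datasets for hate speech detection in English and in a low-resource language, Turkish, each consisting of 100k human-labeled tweets. Our datasets are designed to have an equal number of tweets distributed over five domains. Experimental results supported by statistical tests show that transformer-based language models outperform conventional bag-of-words and neural models by at least 5% in English, and also outperform them in Turkish, for large-scale hate speech detection. The performance also scales to different training sizes: 98% of the performance in English and 97% in Turkish is recovered when 20% of the training instances are used. We further examine the generalization ability of cross-domain transfer among hate domains. We show that, on average, 96% of a target domain's performance is recovered by other domains for English, and 92% for Turkish. Gender and religion generalize more successfully to other domains, while sports generalizes worst.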
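
The "recovered performance" figures can be read as a ratio between a cross-domain score and the corresponding in-domain score. The sketch below illustrates that arithmetic under this assumption; the abstract does not define the metric precisely, and the F1 values are hypothetical, not taken from the paper.

```python
def recovery(cross_domain_score, in_domain_score):
    """Share of in-domain performance retained by a model trained on another domain."""
    return cross_domain_score / in_domain_score


# Hypothetical F1 scores measured on one target domain (illustrative values only).
in_domain_f1 = 0.80
cross_domain_f1 = {"gender": 0.78, "religion": 0.77, "sports": 0.72}

# Recovery per source domain, then the average over source domains.
recovered = {src: recovery(f1, in_domain_f1) for src, f1 in cross_domain_f1.items()}
average_recovery = sum(recovered.values()) / len(recovered)
```

With these made-up numbers the source domain with the lowest ratio is the one that transfers worst to the target domain.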

Building Segmentation on Satellite Images and Performance of Post-Processing Methods

Metehan Yalçın , Ahmet Alp Kindiroglu , Furkan Burak Bağcı , Ufuk Uyan , Mahiye Uluyağmur Öztürk

Categories: Computer Vision

2022-12-28

Researchers are working intensively on satellite images because of the information they contain, the development of computer vision algorithms, and the ease of access to such images. Building segmentation of satellite images can be used in many applications, such as city, agricultural, and communication network planning. However, since a dataset does not exist for every region, a model trained in one region must generalize to others. In this study, we trained several models on data from China and applied post-processing to the best model among them. These models are evaluated on the Chicago region of the INRIA dataset. Although state-of-the-art results in this area have not been achieved, the results are promising. In this study, we aim to present our initial experimental results on building segmentation from satellite images.

Structural State Translation: Condition Transfer between Civil Structures Using Domain-Generalization for Structural Health Monitoring

Furkan Luleci , F. Necati Catbas

Categories: Machine Learning | Artificial Intelligence

2022-12-28

Using Structural Health Monitoring (SHM) systems with extensive sensing arrangements on every civil structure can be costly and impractical. Various concepts have been introduced to alleviate such difficulties, such as Population-based SHM (PBSHM). Nevertheless, the studies presented in the literature do not adequately address the challenge of accessing information on different structural states (conditions) of dissimilar civil structures. The study herein introduces a novel framework named Structural State Translation (SST), which aims to estimate the response data of different civil structures based on the information obtained from a dissimilar structure. SST can be defined as translating the state of one civil structure to another state after discovering and learning the domain-invariant representation in the source domains of a dissimilar civil structure. SST employs a Domain-Generalized Cycle-Generative (DGCG) model to learn the domain-invariant representation in the acceleration datasets obtained from a numeric bridge structure that is in two different structural conditions. The model is then tested on three dissimilar numeric bridge models to translate their structural conditions. The evaluation results of SST via Mean Magnitude-Squared Coherence (MMSC) and modal identifiers showed that the translated bridge states (synthetic states) are significantly similar to the real ones. Specifically, the minimum and maximum average MMSC values of real and translated bridge states are 91.2% and 97.1%, the minimum and maximum differences in natural frequencies are 0% and 5.71%, and the minimum and maximum Modal Assurance Criterion (MAC) values are 0.870 and 0.998. This study is critical for data scarcity and PBSHM, as it demonstrates that data for a given structural state can be obtained while the structure is actually in a different condition or state.
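
For readers unfamiliar with the Modal Assurance Criterion, the sketch below computes it for two real-valued mode shape vectors using the standard definition, MAC = (φa · φb)² / ((φa · φa)(φb · φb)). It is an illustration only: the two mode shapes are hypothetical values, not data from the study, and complex mode shapes would need conjugate products instead.

```python
def mac(phi_a, phi_b):
    """Modal Assurance Criterion between two real-valued mode shape vectors."""
    dot_ab = sum(a * b for a, b in zip(phi_a, phi_b))
    dot_aa = sum(a * a for a in phi_a)
    dot_bb = sum(b * b for b in phi_b)
    return dot_ab ** 2 / (dot_aa * dot_bb)


# Hypothetical first bending mode sampled at five sensor locations.
real_mode = [0.31, 0.59, 0.81, 0.95, 1.00]
translated_mode = [0.30, 0.60, 0.80, 0.96, 0.99]

similarity = mac(real_mode, translated_mode)
```

A MAC of 1 means the two shapes are identical up to scaling, and a MAC of 0 means they are orthogonal, which is why values of 0.870 and above indicate close agreement between real and translated states.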

Semi-Supervised Domain Adaptation for Semantic Segmentation of Roads from Satellite Images

Ahmet Alp Kindiroglu , Metehan Yalçın , Furkan Burak Bağcı , Mahiye Uluyağmur Öztürk

Categories: Computer Vision

2022-12-26

This paper presents the preliminary findings of a semi-supervised segmentation method for extracting roads from satellite images. Artificial neural networks and image segmentation methods are among the most successful methods for extracting road data from satellite images. However, these models require large amounts of training data from different regions to achieve high accuracy. When such data is insufficient in quantity or quality, a standard approach is to train deep neural networks by transferring knowledge from annotated data obtained from different sources. This study proposes a method that performs road segmentation with semi-supervised learning. A semi-supervised domain adaptation method based on pseudo-labeling and the Minimum Class Confusion method is proposed, and it is observed to increase performance on the target datasets.
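
Pseudo-labeling, one of the two ingredients named above, is commonly implemented by keeping only the target-domain predictions whose confidence exceeds a threshold and reusing them as training labels. The following is a minimal sketch of that generic idea; the threshold of 0.9, the two-class setup, and the probability values are assumptions for illustration, not details from the paper.

```python
def pseudo_label(probabilities, threshold=0.9):
    """Return (index, class) pairs for predictions confident enough to reuse as labels."""
    selected = []
    for index, probs in enumerate(probabilities):
        # Class with the highest predicted probability for this sample.
        best_class = max(range(len(probs)), key=probs.__getitem__)
        if probs[best_class] >= threshold:
            selected.append((index, best_class))
    return selected


# Hypothetical softmax outputs over (background, road) for four unlabeled target pixels.
target_probs = [[0.97, 0.03], [0.55, 0.45], [0.08, 0.92], [0.40, 0.60]]

labels = pseudo_label(target_probs, threshold=0.9)
```

Only the first and third pixels pass the threshold here; the uncertain ones are left unlabeled rather than risk training on wrong labels.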

Building Height Prediction with Instance Segmentation

Furkan Burak Bagci , Ahmet Alp Kindriroglu , Metehan Yalcin , Ufuk Uyan , Mahiye Uluyagmur Ozturk

Categories: Computer Vision

2022-12-19

Extracting building heights from satellite images is an active research area with applications in fields such as telecommunications and city planning. Many studies use Digital Surface Models (DSM) generated with lidar or stereo images for this purpose. Predicting building heights using only RGB images is challenging because of insufficient data, low data quality, variations in building types, and different angles of light and shadow. In this study, we present an instance segmentation-based building height extraction method that predicts building masks with their respective heights from a single RGB satellite image. We used satellite images with building height annotations from certain cities, together with an open-source satellite dataset, in a transfer learning approach. We achieved a bounding box mAP of 59, a mask mAP of 52.6, and an average accuracy of 70% for buildings belonging to each height class in our test set.

Minimum Class Confusion based Transfer for Land Cover Segmentation in Rural and Urban Regions

Metehan Yalçın , Ahmet Alp Kındıroğlu , Furkan Burak Bağcı , Ufuk Uyan , Mahiye Uluyağmur Öztürk

Categories: Computer Vision

2022-12-05

Transfer learning methods are widely used in satellite image segmentation problems and improve on classical supervised learning methods. In this study, we present a semantic segmentation method that produces land cover maps using transfer learning. We compare models trained on low-resolution images with insufficient data for the targeted region or zoom level. To boost performance on target data, we experiment with models trained with unsupervised, semi-supervised, and supervised transfer learning approaches, using satellite images from public datasets and other unlabeled sources. According to the experimental results, transfer learning improves segmentation performance by 3.4% MIoU (Mean Intersection over Union) in rural regions and by 12.9% MIoU in urban regions. We observed that transfer learning is more effective when the two datasets share a comparable zoom level and are labeled with identical rules; otherwise, semi-supervised learning that treats the data as unlabeled is more effective. In addition, experiments showed that HRNet outperformed building segmentation approaches in multi-class segmentation.
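
As a reminder of how the MIoU metric reported above is typically computed, here is a short sketch that derives it from a confusion matrix: per class, IoU = TP / (TP + FP + FN), and MIoU is the mean over classes. The two-class matrix is a hypothetical example, not data from the paper, and skipping classes with an empty union is one convention among several.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix (rows: true class, columns: predicted)."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true class c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical pixel counts for two classes (e.g. rural land vs. built-up area).
confusion = [[50, 10],
             [5, 35]]

miou = mean_iou(confusion)
```

In this example the per-class IoUs are 50/65 and 35/50, so the mean is roughly 0.735.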

PTSD in the Wild: A Video Database for Studying Post-Traumatic Stress Disorder Recognition in Unconstrained Environments

Moctar Abdoul Latif Sawadogo , Furkan Pala , Gurkirat Singh , Imen Selmi , Pauline Puteaux , Alice Othmani

Categories: Computer Vision | Machine Learning

2022-09-28

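Post-traumatic stress disorder (PTSD) is a chronic and debilitating mental condition that develops in response to catastrophic life events, such as military combat, sexual assault, and natural disasters. PTSD is characterized by flashbacks of past traumatic events, intrusive thoughts, nightmares, hypervigilance, and sleep disturbance, all of which affect a person's life and lead to considerable social, occupational, and interpersonal dysfunction. PTSD is diagnosed by medical professionals using self-assessment questionnaires of PTSD symptoms as defined in the Diagnostic and Statistical Manual of Mental Disorders (DSM). In this paper, for the first time, we collected, annotated, and prepared for public distribution a new video database for automatic PTSD diagnosis, called the PTSD in the wild dataset. The database exhibits "natural" and large variability in acquisition conditions, facial expression, lighting, focus, resolution, age, gender, race, occlusion, and background. In addition to describing the details of the dataset collection, we provide a benchmark for evaluating computer vision and machine learning based approaches on the PTSD in the wild dataset. Furthermore, we propose and evaluate a deep learning based approach for PTSD detection. The proposed approach shows very promising results. Interested researchers can download a copy of the PTSD-in-the-wild dataset from: http://www.lissi.fr/ptsd-dataset/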

Show, Interpret and Tell: Entity-aware Contextualised Image Captioning in Wikipedia

Khanh Nguyen , Ali Furkan Biten , Andres Mafla , Lluis Gomez , Dimosthenis Karatzas

Categories: Computer Vision

2022-09-21

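Humans exploit prior knowledge to describe images and are able to adapt their explanations to specific contextual information, even to the extent of inventing plausible explanations when the contextual information and the image do not match. In this work, we propose the novel task of captioning Wikipedia images by integrating contextual knowledge. Specifically, we produce models that jointly reason over Wikipedia articles, Wikimedia images, and their associated descriptions to produce contextualized captions. In particular, a similar Wikimedia image can be used to illustrate different articles, and the produced caption needs to be adapted to the specific context, which allows us to explore the limits of a model's ability to adjust captions to different contextual information. A particularly challenging task in this domain is dealing with out-of-dictionary words and named entities. To address this, we propose a pre-training objective, Masked Named Entity Modeling (MNEM), and show that this pretext task yields an improvement over baseline models. Furthermore, we verify that a model pre-trained with the MNEM objective on Wikipedia generalizes well to a news captioning dataset. Additionally, we define two different test splits according to the difficulty of the captioning task. We offer insights into the role and importance of each modality and highlight the limitations of our model. The code, models, and data splits will be made publicly available upon acceptance.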
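
The abstract names Masked Named Entity Modeling but does not describe its mechanics. A plausible reading, by analogy with masked language modeling, is that tokens belonging to named entities are replaced with a mask token and the model is trained to recover them. The sketch below shows only that masking step under this assumption; the function name, the `[MASK]` token, the example caption, and the entity spans are all hypothetical.

```python
def mask_named_entities(tokens, entity_spans, mask_token="[MASK]"):
    """Replace every token inside an entity span [start, end) with the mask token.

    Returns the masked token list and a dict mapping masked positions to the
    original tokens, which would serve as prediction targets.
    """
    masked = list(tokens)
    targets = {}
    for start, end in entity_spans:
        for i in range(start, end):
            targets[i] = tokens[i]
            masked[i] = mask_token
    return masked, targets


# Hypothetical caption with entity spans, e.g. produced by an off-the-shelf NER tagger.
caption = ["Marie", "Curie", "in", "her", "laboratory", "in", "Paris"]
spans = [(0, 2), (6, 7)]

masked, targets = mask_named_entities(caption, spans)
```

Unlike random token masking, this concentrates the training signal on the rare entity tokens that the abstract identifies as the hard part of the task.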

Fine-grained Classification of Solder Joints with α-skew Jensen-Shannon Divergence

Furkan Ulger , Seniha Esen Yuksel , Atila Yilmaz , Dincer Gokcen

Categories: Computer Vision

2022-09-20

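Solder joint inspection (SJI) is a critical process in the production of printed circuit boards (PCBs). Detecting solder errors during SJI is very challenging because solder joints are very small and can take various shapes. In this study, we first show that solders have low feature diversity and that SJI can be carried out as a fine-grained image classification task, which focuses on hard-to-distinguish object classes. To improve fine-grained classification accuracy, penalizing confident model predictions by maximizing entropy has been found useful in the literature. In line with this, we propose using the α-skew Jensen-Shannon divergence (α-JS) to penalize confidence in model predictions. We compare α-JS regularization with existing entropy-regularization-based methods and with methods based on attention mechanisms, segmentation techniques, transformer models, and specific loss functions. We show that the proposed approach achieves competitive F1 scores and accuracy across different models in the fine-grained solder joint classification task. Finally, we visualize the activation maps and show that, with entropy regularization, more precise class-discriminative regions are localized, and these are also more resilient to noise. Code will be made available here upon acceptance.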
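
To make the idea of a divergence-based confidence penalty concrete, the sketch below implements one common skewed form of the Jensen-Shannon divergence, JS_α(p, q) = (1 − α) KL(p ‖ m) + α KL(q ‖ m) with m = (1 − α) p + α q, and evaluates it between a prediction and the uniform distribution. This is an assumption-laden illustration: the paper's exact definition of α-JS and how it enters the loss may differ, and the example probabilities are made up.

```python
import math


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def skew_js(p, q, alpha=0.5):
    """Skewed Jensen-Shannon divergence with mixture m = (1 - alpha) * p + alpha * q."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    m = [(1 - alpha) * pi + alpha * qi for pi, qi in zip(p, q)]
    return (1 - alpha) * kl(p, m) + alpha * kl(q, m)


# Hypothetical softmax outputs for a three-class solder joint classifier.
uniform = [1 / 3, 1 / 3, 1 / 3]
confident = [0.98, 0.01, 0.01]
unsure = [0.40, 0.30, 0.30]

penalty_confident = skew_js(confident, uniform, alpha=0.5)
penalty_unsure = skew_js(unsure, uniform, alpha=0.5)
```

The confident prediction lies much further from the uniform distribution, so adding such a term to the training loss (with some weight) would discourage overconfident outputs; at α = 0.5 the expression reduces to the standard Jensen-Shannon divergence.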
